COVID-19 Brief Data Analysis

Introduction

Our domain of interest is the impact of COVID-19 in the U.S. This spring, an outbreak of the virus has threatened hundreds of thousands of lives in the U.S., and even more around the world. Our group hopes that our analysis of multiple COVID-19 datasets can raise vigilance and help people understand this disease better.

The first dataset is a comprehensive one covering each U.S. county. It collects information on weather, social and health factors, and the COVID-19 situation in each county. Since its size exceeds the upload limit, we decided to keep it local.

We also found a dataset of provisional COVID-19 death counts by state, sex, and age that could help us understand the bigger picture.

Finally, the US State COVID-19 Daily dataset collects the number of daily cases in the U.S., and a helper dataset stores the latitude and longitude of each state and of the country to support our charts.

Summary Information

A Summary Table

Our summary table summarizes the total number of COVID-19 tests conducted in the U.S. It also shows the number of positive and negative results, along with the percentage of positive results out of all tests. The data is grouped by state so that we can compare the situation across states, and the table is sorted in descending order by the percentage of positive results. From the table, we can conclude that New Jersey has the highest percentage of positive COVID-19 tests among all states. Note that even though New York has conducted significantly more tests than New Jersey, its percentage of positive results is lower. Another insight is that some states and territories have conducted very few tests: American Samoa, for example, reports only 105.

| State | Total Positive Cases | Total Negative Cases | Total Tests | Percent of Positive Cases |
|-------|---------------------:|---------------------:|------------:|--------------------------:|
| NJ | 140742 | 292317 | 433059 | 32.50 |
| NY | 338479 | 886580 | 1225059 | 27.63 |
| CT | 34333 | 104091 | 138424 | 24.80 |
| DC | 6485 | 24559 | 31044 | 20.89 |
| DE | 6741 | 26540 | 33281 | 20.25 |
| MD | 34061 | 135425 | 169486 | 20.10 |
| PR | 2294 | 9304 | 11598 | 19.78 |
| MA | 79324 | 322164 | 401488 | 19.76 |
| PA | 57989 | 237989 | 295978 | 19.59 |
| CO | 19879 | 88759 | 108638 | 18.30 |
| NE | 8572 | 39354 | 47926 | 17.89 |
| IL | 83017 | 388546 | 471563 | 17.60 |
| IN | 25126 | 125383 | 150509 | 16.69 |
| VA | 25800 | 129511 | 155311 | 16.61 |
| IA | 12912 | 68361 | 81273 | 15.89 |
| MI | 48012 | 259869 | 307881 | 15.59 |
| SD | 3663 | 21529 | 25192 | 14.54 |
| LA | 32050 | 195962 | 228012 | 14.06 |
| GA | 34633 | 227544 | 262177 | 13.21 |
| KS | 7116 | 46989 | 54105 | 13.15 |
| RI | 11613 | 83625 | 95238 | 12.19 |
| OH | 25250 | 192474 | 217724 | 11.60 |
| MN | 12494 | 108304 | 120798 | 10.34 |
| MS | 9908 | 87784 | 97692 | 10.14 |
| NV | 6310 | 57750 | 64060 | 9.85 |
| AZ | 11734 | 111079 | 122813 | 9.55 |
| NH | 3158 | 32391 | 35549 | 8.88 |
| WI | 10610 | 112729 | 123339 | 8.60 |
| SC | 7927 | 85208 | 93135 | 8.51 |
| MO | 10006 | 111290 | 121296 | 8.25 |
| AL | 10310 | 122908 | 133218 | 7.74 |
| NC | 15345 | 186898 | 202243 | 7.59 |
| TX | 39868 | 485828 | 525696 | 7.58 |
| FL | 41921 | 537657 | 579578 | 7.23 |
| ID | 2260 | 30418 | 32678 | 6.92 |
| WA | 17121 | 234986 | 252107 | 6.79 |
| CA | 69329 | 963526 | 1032855 | 6.71 |
| KY | 6677 | 97340 | 104017 | 6.42 |
| ME | 1477 | 22091 | 23568 | 6.27 |
| AR | 4164 | 66274 | 70438 | 5.91 |
| VI | 68 | 1115 | 1183 | 5.75 |
| TN | 16110 | 267713 | 283823 | 5.68 |
| OK | 4731 | 91379 | 96110 | 4.92 |
| NM | 5069 | 101636 | 106705 | 4.75 |
| WY | 675 | 14384 | 15059 | 4.48 |
| VT | 927 | 20327 | 21254 | 4.36 |
| OR | 3283 | 74291 | 77574 | 4.23 |
| UT | 6431 | 147053 | 153484 | 4.19 |
| GU | 149 | 3916 | 4065 | 3.67 |
| ND | 1571 | 46261 | 47832 | 3.28 |
| WV | 1371 | 63697 | 65068 | 2.11 |
| MT | 461 | 22563 | 23024 | 2.00 |
| HI | 633 | 37305 | 37938 | 1.67 |
| AK | 383 | 29570 | 29953 | 1.28 |
| MP | 19 | 2854 | 2873 | 0.66 |
| AS | 0 | 105 | 105 | 0.00 |
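
The derived columns in the table above can be reproduced from the positive and negative counts alone. The following is a minimal Python sketch, not the code used for this report: the function name `summarize` and the tuple layout are our own assumptions, and only four rows from the table are included for brevity.

```python
# Rows copied from the summary table: (state, positive, negative).
rows = [
    ("NY", 338479, 886580),
    ("NJ", 140742, 292317),
    ("CT", 34333, 104091),
    ("AS", 0, 105),
]


def summarize(rows):
    """Return (state, positive, negative, total, percent_positive) tuples,
    sorted by percent_positive in descending order.

    `summarize` is a hypothetical helper for illustration only.
    """
    table = []
    for state, positive, negative in rows:
        total = positive + negative
        # Guard against a state that reports no tests at all.
        percent = round(100 * positive / total, 2) if total else 0.0
        table.append((state, positive, negative, total, percent))
    return sorted(table, key=lambda row: row[4], reverse=True)


summary = summarize(rows)
for state, positive, negative, total, percent in summary:
    print(f"{state} {positive} {negative} {total} {percent:.2f}")
```

Sorting on the percentage rather than on the raw counts is what places New Jersey ahead of New York, even though New York has more tests and more positive results.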

Charts

Our first chart is a bar chart of the total number of people hospitalized in each state up to the latest day for which data was collected. We include this chart because we want to compare states and understand the general situation in the U.S. right now. The bar chart shows that New York State has far more hospitalizations than any other state: while every other state has fewer than 10,000, New York has over 73,000.

Our second chart is a map that places a circle on each state and uses the size of the circle to show the number of positive COVID-19 cases. We included this chart because a map shows geographic location and positive case counts at the same time. From this map, we can tell that the Northeast suffers the most, and that California has considerably more cases than other western states. As one would expect, more populous states tend to have more cases, which is consistent with the Midwest generally having fewer cases.
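
The report does not state how case counts were converted to circle sizes. One common choice, sketched below in Python as an assumption rather than a description of our actual chart, is to make the circle's area (not its radius) proportional to the case count, so that a state with twice the cases does not look four times as large. The function `circle_radius` and the `max_radius` value are hypothetical; the counts are taken from the summary table.

```python
import math


def circle_radius(cases, max_cases, max_radius=40.0):
    """Radius such that circle AREA is proportional to the case count.

    Hypothetical helper: `max_radius` is the radius given to the state
    with the most cases, in whatever units the map uses.
    """
    if max_cases <= 0:
        return 0.0
    return max_radius * math.sqrt(cases / max_cases)


# Positive case counts copied from the summary table.
positives = {"NY": 338479, "NJ": 140742, "CA": 69329, "MT": 461}
largest = max(positives.values())
radii = {state: circle_radius(count, largest) for state, count in positives.items()}

for state, radius in radii.items():
    print(f"{state}: radius {radius:.1f}")
```

With this scaling, the ratio of two circle areas equals the ratio of the two case counts, which keeps small states such as Montana visible without exaggerating the largest ones.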

Our third chart is a scatter plot of daily positive COVID-19 cases from January to mid-May of this year. We include this plot because we want to learn the basic trend in the growth of positive cases in the U.S. We found that up until the middle of March, growth was controlled and steady. Sadly, starting in late March, the number of positive cases rose from fewer than 500 to almost 190,000, and it has not decreased at all. One possible interpretation is that more tests were conducted after late March.
